Placement and routing on a circuit

ABSTRACT

Methods and apparatuses to place and route cells on integrated circuit chips along paths are described. In one embodiment, a method to lay out an integrated circuit comprises routing a wire to connect a first cell of the integrated circuit and a second cell of the integrated circuit, and placing a third cell of the integrated circuit after routing the wire to connect the first cell and the second cell.

This application is a divisional of U.S. patent application Ser. No. 12/069,206, filed on Feb. 7, 2008, now U.S. Pat. No. 8,732,645, issued May 20, 2014, which claims benefit to U.S. patent application Ser. No. 10/351,094, filed on Jan. 23, 2003, now U.S. Pat. No. 7,350,173, issued Mar. 25, 2008. This application also claims the benefit of the filing date of provisional application Ser. No. 60/388,492, filed Jun. 11, 2002, and entitled “Method and Apparatus for Placement and Routing Cells on Integrated Circuit Chips” by the inventors Roger P. Ang, Ken R. McElvain, and Kenneth S. McElvain.

FIELD

The invention relates to designing integrated circuits, and more particularly to incremental placement and routing of cells for integrated circuits.

BACKGROUND

For the design of digital circuits on the scale of VLSI (very large scale integration) technology, designers often employ computer-aided techniques. Standard languages such as Hardware Description Languages (HDLs) have been developed to describe digital circuits to aid in the design and simulation of complex digital circuits. Several hardware description languages, such as VHDL and Verilog, have evolved as industry standards. VHDL and Verilog are general purpose hardware description languages that allow definition of a hardware model at the gate level, the register transfer level (RTL) or the behavioral level using abstract data types. As device technology continues to advance, various product design tools have been developed to adapt HDLs for use with newer devices and design styles.

In designing an integrated circuit with an HDL code, the code is first written and then compiled by an HDL compiler. The HDL source code describes at some level the circuit elements, and the compiler produces an RTL netlist from this compilation. The RTL netlist is typically a technology independent netlist in that it is independent of the technology/architecture of a specific vendor's integrated circuit, such as field programmable gate arrays (FPGA) or an application-specific integrated circuit (ASIC). The RTL netlist corresponds to a schematic representation of circuit elements (as opposed to a behavioral representation). A mapping operation is then performed to convert from the technology independent RTL netlist to a technology specific netlist which can be used to create circuits in the vendor's technology/architecture. It is well known that FPGA vendors utilize different technologies/architectures to implement logic circuits within their integrated circuits. Thus, the technology independent RTL netlist is mapped to create a netlist which is specific to a particular vendor's technology/architecture.

One operation that is often desirable in this process is to plan the layout of a particular integrated circuit, to control timing problems and to manage interconnections between regions of an integrated circuit. This is sometimes referred to as “floor planning.” A typical floor planning operation divides the circuit area of an integrated circuit into regions, sometimes called “blocks,” and then assigns logic to reside in a block. These regions may be rectangular or non-rectangular. This operation has two effects: the estimation error for the location of the logic is reduced from the size of the integrated circuit to the size of the block (which tends to reduce errors in timing estimates), and the placement and the routing typically run faster because the task has been reduced from one very large problem into a series of simpler problems.

After the logic elements are placed into blocks, the cells (e.g., gates or transistors) are placed and routed in the area for a chip. FIG. 2 shows a conventional method to place and route the cells of an integrated circuit. After operation 201 places all cells for the integrated circuit, operation 203 routes wires between cells. Thus, the operations of placing and routing are separated. Since the placement is performed without actual routing, the placement of the cells is based on the estimated routing. Once the wires are actually routed, operation 205 can analyze timing accurately based on the placement and routing information. If operation 207 determines that the timing requirements (e.g., slack) are not satisfied, the previous design may be modified in operation 209, before the cells are placed again in operation 201.

Slack is the difference between the desired delay and the actual (estimated or computed) delay. When the desired delay is larger than the actual delay, the slack is positive; otherwise, the slack is negative. Typically, it is necessary to make the slack positive (or close to zero) to meet the timing requirement (e.g., reducing the wire delay to increase the slack).
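As a minimal illustration of this definition, slack can be computed as the desired delay minus the actual delay. The function name and the nanosecond units below are assumptions made for the example and do not come from the description.

```python
def slack(desired_delay_ns: float, actual_delay_ns: float) -> float:
    """Slack = desired delay minus actual (estimated or computed) delay."""
    return desired_delay_ns - actual_delay_ns


# Desired delay larger than actual delay: positive slack.
meets_timing = slack(5.0, 4.0)
# Actual delay larger than desired delay: negative slack.
misses_timing = slack(5.0, 6.5)
print(meets_timing, misses_timing)
```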

Thus, the conventional method separates the phases of placement and routing. The cells (e.g., gates) of a design are fully placed (e.g., assigned locations) before the wires are actually routed. Multiple iterations of this process may be applied but the design is typically still fully placed before routing is assigned (or reassigned). Because the wires are not routed at the same time as placement, conventional placement algorithms estimate the result of routing. These estimates do not account for the available information of routed wires, even if only a small part of an already placed and routed design is being modified.

SUMMARY OF THE INVENTION

Methods and apparatuses to place and route cells on integrated circuit chips along paths are described here.

In one aspect of the invention, methods to lay out an integrated circuit are based on placing and routing cells along paths. In one embodiment of the present invention, a method to lay out an integrated circuit includes: routing a wire to connect a first cell of the integrated circuit and a second cell of the integrated circuit; and placing a third cell of the integrated circuit after the wire is routed to connect the first cell and the second cell. In one example, the first, second and third cells are on a first path; and, the third cell is connected to one of the first and second cells on the first path by only one net. The first path is selected from a set of paths; and, the first and second cells are placed before the wire is routed to connect the first cell and the second cell. In one example, timing is analyzed using the route of the wire connecting the first cell and the second cell to generate first timing information; and, a second path is selected from the set of paths from a timing analysis using the first timing information, before the cells of the second path are placed. In one example, it is determined whether or not the third cell is previously placed; and the third cell is relocated in response to a determination that: a) the third cell is previously placed on a third path, b) the third cell is either a converging point or a diverging point of the first path and the third path, and, c) the third cell has positive slack. In one example, wire delays for placing the third cell at a plurality of locations are determined; and, a first location is selected for the third cell from the plurality of locations according to timing based on the wire delays. In one example, the first location results in the lowest routing congestion and slack larger than a threshold for the third cell among the plurality of locations; in another example, the first location results in the largest slack for the third cell among the plurality of locations.

In one embodiment of the present invention, a method to lay out an integrated circuit includes: grouping cells in paths; and placing the paths one after another. In one example, a first set of cells of a first path is determined; a second set of cells of a second path is determined; and the second set of cells is placed after the first set of cells is placed. In one example, the first and second paths contain common cells; and, both the first and second sets contain a third set of cells. In one example, the third set of cells is not repositioned when placing the second set of cells, since these cells are already placed in placing the first set of cells. In one example, a cell at a converging point or a diverging point of the first and second paths may be repositioned when placing the second set of cells (e.g., when the cell at the converging or diverging point has a positive slack). In one example, the nets of the first path are routed before the second set of cells is placed (e.g., routing the nets of the first path while placing the first set of cells of the first path). In one example, paths that are more critical in timing are placed before the paths that are less critical in timing. For example, it is determined whether or not the second path is more critical in timing than a third path; and, the second set of cells is placed before the cells of the third path if the second path is more critical in timing than the third path. A route of a wire, which is previously routed, is used in determining a timing parameter for determining whether or not the second path is more critical in timing than the third path. Similarly, the first set of cells is placed before the second set of cells if the first path is more critical in timing than the second path; and, the routes of wires, which are previously routed, are used in determining timing parameters for determining whether or not the first path is more critical in timing than the second path.
A list of paths to be placed can be sorted according to a timing parameter; and, the paths are placed sequentially according to the list. When the routes of the wires are not available (e.g., when the wires are not routed), estimates are made in evaluating the timing parameter. The list of paths is updated according to updated timing parameters after some of the paths are placed and routed in one example. In one example, at least a portion of the first set of cells is placed one cell after another along the first path in a direction (e.g., a direction from a source of the first path toward a destination of the first path; or, a direction from the destination toward the source). In one example, a first portion of the first set of cells is placed one cell after another along the first path in a direction from a source of the first path toward a destination of the first path; and, a second portion of the first set of cells is placed one cell after another along the first path in a direction from the destination toward the source. A path splitting net is used to divide the first path into the first and second portions; and, the path splitting net is selected based on its drive strength. In one example, the net driven by a strong driver is selected as a path splitting net. In one example, the first and second paths are within a portion of the integrated circuit; and, the cells within the portion of the integrated circuit are grouped in paths for placing and routing the portion of the integrated circuit (e.g., in modifying a portion of a design).

In one embodiment of the present invention, a method to lay out an integrated circuit includes: placing a first cell at a first location, at which the first cell overlaps with a portion of a second cell that is placed at a second location before the first cell is placed; and moving the second cell from the second location to a third location to reduce overlapping (e.g., to eliminate overlapping) with the first cell placed at the first location. In one example, the illegal placement of the first cell with overlapping is allowed when the first cell is larger than the second cell. The second location may coincide with the first location; and, in one example, the first location is determined from optimizing a design goal, which is improved when an area of overlapping between the first and second cells is reduced. In one example, the illegal placement is generated in increasing the size of the first cell; in another example, the illegal placement is generated in inserting the first cell to buffer a signal.

In one embodiment of the present invention, a method to lay out an integrated circuit includes: evaluating a first timing parameter for a cell of the integrated circuit at a first location; evaluating a second timing parameter for the cell at a second location; and placing the cell at a selected one of the first and second locations according to the first and second timing parameters. At least one of the first and second timing parameters is evaluated based on a route of a net that is previously routed. The net is on a path on which the cell is located; and, the net is connected to the cell on the path in one example. In one example, a first congestion indicator is evaluated for the cell at the first location; a second congestion indicator is evaluated for the cell at the second location; and, the selected one of the first and second locations is determined from the first and second congestion indicators when the first and second timing parameters are better than a threshold. In one example, the cell is not relocated if the cell is previously placed and if the cell is neither a converging point nor a diverging point of two paths. In one example, the cell is on a first path; and, the selected one of the first and second locations is determined from optimizing a design goal which is improved when a distance between a location for placing the cell and a destination of the first path is reduced.

In one embodiment of the present invention, a method to lay out an integrated circuit includes: determining a plurality of nets of a path; generating a plurality of placement designs; and selecting a first design from the plurality of placement designs. Each of the placement designs is generated from: placing cells of a first segment of the path near a first location; and placing cells of a second segment of the path near a second location. The first segment and the second segment are connected by one of the plurality of nets. In one example, at least one of the nets of the path is routed for each of the placement designs; and, the first design is selected based on routes of the nets routed for each of the placement designs. In one example, the plurality of nets are determined according to drive strength of corresponding nets; and, nets driven by strong drivers are selected as the plurality of nets in one example. In one example, it is determined whether or not the first design has a long wire driven by a weak driver. When the first design has a long wire driven by a weak driver, the driver is resized to improve the timing for the path; alternatively, a buffer is inserted to improve the timing for the path. In one example, the illegal placement of the resized driver or the inserted buffer is tolerated when overlapping occurs; and, overlapping is eliminated in subsequent operations.

The present invention includes apparatuses which perform these methods, including data processing systems which perform these methods and computer readable media which when executed on data processing systems cause the systems to perform these methods.

Other features of the present invention will be apparent from the accompanying drawings and from the detailed description which follows.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is illustrated by way of example and not limitation in the figures of the accompanying drawings in which like references indicate similar elements.

FIG. 1 shows a block diagram example of a data processing system which may be used with the present invention.

FIG. 2 shows a conventional method to place and route the cells of an integrated circuit.

FIG. 3 shows a method to place and route the cells of an integrated circuit according to one embodiment of the present invention.

FIG. 4 shows a detailed method to place and route the cells of an integrated circuit according to one embodiment of the present invention.

FIG. 5 shows a method to place and route the cells of an integrated circuit based on paths according to one embodiment of the present invention.

FIG. 6 shows a method to place and route the cells on a path according to one embodiment of the present invention.

FIG. 7 shows an example to place and route the cells on a path in areas near the source and destination locations according to one embodiment of the present invention.

FIG. 8 shows an example to place and route the cells on a path in clusters according to one embodiment of the present invention.

FIG. 9 shows an example to place and route the cells that are shared by two paths according to one embodiment of the present invention.

DETAILED DESCRIPTION

The following description and drawings are illustrative of the invention and are not to be construed as limiting the invention. Numerous specific details are described to provide a thorough understanding of the present invention. However, in certain instances, well known or conventional details are not described in order to avoid obscuring the description of the present invention.

Many of the methods of the present invention may be performed with a digital processing system, such as a conventional, general purpose computer system. Special purpose computers which are designed or programmed to perform only one function may also be used.

FIG. 1 shows one example of a typical computer system which may be used with the present invention. Note that while FIG. 1 illustrates various components of a computer system, it is not intended to represent any particular architecture or manner of interconnecting the components as such details are not germane to the present invention. It will also be appreciated that network computers and other data processing systems which have fewer components or perhaps more components may also be used with the present invention. The computer system of FIG. 1 may, for example, be an Apple Macintosh computer.

As shown in FIG. 1, the computer system 101, which is a form of a data processing system, includes a bus 102 which is coupled to a microprocessor 103 and a ROM 107 and volatile RAM 105 and a non-volatile memory 106. The microprocessor 103, which may be a G3 or G4 microprocessor from Motorola, Inc. or IBM, is coupled to cache memory 104 as shown in the example of FIG. 1. The bus 102 interconnects these various components together and also interconnects these components 103, 107, 105, and 106 to a display controller and display device 108 and to peripheral devices such as input/output (I/O) devices which may be mice, keyboards, modems, network interfaces, printers, scanners, video cameras and other devices which are well known in the art. Typically, the input/output devices 110 are coupled to the system through input/output controllers 109. The volatile RAM 105 is typically implemented as dynamic RAM (DRAM) which requires power continually in order to refresh or maintain the data in the memory. The non-volatile memory 106 is typically a magnetic hard drive or a magnetic optical drive or an optical drive or a DVD RAM or other type of memory systems which maintain data even after power is removed from the system. Typically, the non-volatile memory will also be a random access memory although this is not required. While FIG. 1 shows that the non-volatile memory is a local device coupled directly to the rest of the components in the data processing system, it will be appreciated that the present invention may utilize a non-volatile memory which is remote from the system, such as a network storage device which is coupled to the data processing system through a network interface such as a modem or Ethernet interface. The bus 102 may include one or more buses connected to each other through various bridges, controllers and/or adapters as is well known in the art.
In one embodiment the I/O controller 109 includes a USB (Universal Serial Bus) adapter for controlling USB peripherals, and/or an IEEE-1394 bus adapter for controlling IEEE-1394 peripherals.

It will be apparent from this description that aspects of the present invention may be embodied, at least in part, in software. That is, the techniques may be carried out in a computer system or other data processing system in response to its processor, such as a microprocessor, executing sequences of instructions contained in a memory, such as ROM 107, volatile RAM 105, non-volatile memory 106, cache 104 or a remote storage device. In various embodiments, hardwired circuitry may be used in combination with software instructions to implement the present invention. Thus, the techniques are not limited to any specific combination of hardware circuitry and software nor to any particular source for the instructions executed by the data processing system. In addition, throughout this description, various functions and operations are described as being performed by or caused by software code to simplify description. However, those skilled in the art will recognize that what is meant by such expressions is that the functions result from execution of the code by a processor, such as the microprocessor 103.

A machine readable medium can be used to store software and data which, when executed by a data processing system, cause the system to perform various methods of the present invention. This executable software and data may be stored in various places including for example ROM 107, volatile RAM 105, non-volatile memory 106 and/or cache 104 as shown in FIG. 1. Portions of this software and/or data may be stored in any one of these storage devices.

Thus, a machine readable medium includes any mechanism that provides (i.e., stores and/or transmits) information in a form accessible by a machine (e.g., a computer, network device, personal digital assistant, manufacturing tool, any device with a set of one or more processors, etc.). For example, a machine readable medium includes recordable/non-recordable media (e.g., read only memory (ROM); random access memory (RAM); magnetic disk storage media; optical storage media; flash memory devices; etc.), as well as electrical, optical, acoustical or other forms of propagated signals (e.g., carrier waves, infrared signals, digital signals, etc.); etc.

At least one embodiment of the present invention seeks to incrementally place and route cells along timing critical paths, with extensions to place and route subsections of paths and to optimize placement of path divergence/convergence points.

FIG. 3 shows a method to place and route the cells of an integrated circuit according to one embodiment of the present invention. After operation 301 places a portion (e.g., one cell) of the cells of an integrated circuit (e.g., the cells on a path, or in a cluster of cells) at permissible locations, operation 303 routes the portion of the cells. A permissible location is a location that has an area for the cell. Other conditions (e.g., being restricted to a specific region on a chip, the type of the cell, or others) may also be used in defining a permissible location to reduce the number of alternative permissible locations. Operation 305 selects one location from the permissible locations that results in a best design goal (e.g., larger slack, less routing congestion and others) based on detailed placement and routing information. For cells that have not been placed or nets that have not been routed, estimations are used in evaluating the design goals for the permissible locations; for cells that have been placed and nets that have been routed, detailed placement and routing information is used to compute the design goal, which typically depends, at least partially, on a timing parameter computed based on the actual or estimated placement and routing information. Once the best location for placing the portion of cells is determined, the result for routing the portion of the cells at the best location is kept. Alternatively, operation 303 may only estimate the routing results for nets connected to the portion of the cells; after operation 305 determines the best location based on the estimated routing results, actual routing for the portion of the cells placed at the best location is performed (before other portions of the cells are placed and routed). If operation 307 determines that not all cells are placed and routed (or, if optimizations or modifications to a local region are necessary), operations 301-305 can be repeated for another portion of the cells of the integrated circuit.
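The loop of FIG. 3 can be sketched, under simplifying assumptions, as follows. This is an illustration only, not the implementation of any embodiment: the function names, the one-dimensional row of slots, and the distance-based design goal are all hypothetical stand-ins, and actual routing is replaced by a score computed from the cells already placed.

```python
from typing import Callable, Dict, Hashable, Iterable, List

Location = Hashable


def place_incrementally(
    cells: List[str],
    permissible: Callable[[str, Dict[str, Location]], Iterable[Location]],
    design_goal: Callable[[str, Location, Dict[str, Location]], float],
) -> Dict[str, Location]:
    """Place one cell at a time, scoring each candidate location against
    the cells that are already placed (higher design_goal is better)."""
    placed: Dict[str, Location] = {}
    for cell in cells:  # repeat until all cells are handled (cf. operation 307)
        candidates = list(permissible(cell, placed))  # cf. operation 301
        # cf. operations 303/305: evaluate every candidate, keep the best one
        best = max(candidates, key=lambda loc: design_goal(cell, loc, placed))
        placed[cell] = best
    return placed


# Hypothetical example: three chained cells on a row of six slots.
slots = range(6)
order = ["a", "b", "c"]


def free_slots(cell, placed):
    return [s for s in slots if s not in placed.values()]


def goal(cell, loc, placed):
    i = order.index(cell)
    if i == 0:
        return -loc  # first cell: prefer the slot nearest the row origin
    return -abs(loc - placed[order[i - 1]])  # others: stay close to the driver


result = place_incrementally(order, free_slots, goal)
print(result)
```

Because each decision uses the committed locations of earlier cells, later cells are scored against known positions rather than estimates, which is the point of interleaving placement and routing.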

FIG. 4 shows a detailed method to place and route the cells of an integrated circuit according to one embodiment of the present invention. After operation 401 identifies a set of cells in a given netlist (e.g., cells on a critical timing path, or a cluster of cells with high routing congestion), the set of cells is split into segments according to the drive strength of the drivers of the segments in operation 403. One or more strong drivers may be selected to split the set of cells. After operation 405 places and routes each of the segments at permissible locations for the segments, operation 407 selects one best solution from placing the segments at the permissible locations, which results in a best design goal (e.g., larger slack, less routing congestion and others). For example, cells for each segment can be placed and routed near a clustered area at a permissible location. Long wires may be used to connect the segments. From different cluster locations, the best solution can be selected from optimizing a design goal. Since the overall wire length may be minimized by placing the clustered areas on a line passing through the source and the destination of the path, the permissible locations for the segments of the path may be selected from locations near the line passing through the source and destination of the path. Operations 401-407 are repeated until operation 409 determines that all sets of cells are processed.
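One way to restrict candidate cluster locations to those near the line through the source and destination is a simple distance filter. The sketch below is an assumption-laden illustration: it treats the line as the segment between the two endpoints, uses Euclidean distance, and the names `distance_to_segment`, `locations_near_line` and `max_distance` are hypothetical.

```python
import math


def distance_to_segment(p, a, b):
    """Euclidean distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def locations_near_line(locations, source, destination, max_distance):
    """Keep only the locations within max_distance of the source-destination segment."""
    return [
        p for p in locations
        if distance_to_segment(p, source, destination) <= max_distance
    ]


# Hypothetical 5x5 grid with the source and destination at opposite corners.
grid = [(x, y) for x in range(5) for y in range(5)]
near = locations_near_line(grid, (0, 0), (4, 4), max_distance=0.8)
print(len(near), "of", len(grid), "locations kept")
```

Filtering this way shrinks the set of permissible locations that must be placed, routed and scored, at the cost of ignoring solutions that detour far from the line.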

FIG. 5 shows a method to place and route the cells of an integrated circuit based on paths according to one embodiment of the present invention. Operation 501 groups cells by paths. Operation 503 analyzes timing to identify a set of cells on a critical timing path by accounting for already placed and routed cells (e.g., selecting a path with the lowest slack from paths that have not been placed and routed). When detailed information for the placement of a cell or routing a net (logic wire) is not available (since the cell has not been placed or the net has not been routed), estimations are used in the timing analysis of operation 503; otherwise, detailed placement and routing information is used. Operation 505 splits the path into segments at a set of nets that are connected to the corresponding drivers with strong drive strength. In one embodiment of the present invention, operation 505 is not performed; and, the entire path is placed and routed one cell after another along the path from the source to the destination (or backward from the destination to the source). Operation 507 places and routes the segment on the beginning part of the path from the source toward the destination of the path; and, operation 509 places and routes the segment on the ending part of the path backward from the destination toward the source of the path. If there is a segment with a seed cell that has already been placed (e.g., placed and routed when processing a previous path), operation 511 places and routes the segment from the seed cell toward the line between the source and the destination of the path when possible. If there are other segments, operation 513 places and routes these segments close to the line between the source and destination of the path in clusters. Long wires may be used to connect the segments of the path.
A number of different designs (e.g., different number of segments at different cluster locations, or segments separated at a different set of nets) may be evaluated to select a solution that optimizes a design goal. After all the segments of the path are placed and routed (and the best design is selected), operation 503 may be performed again to identify the next path to be placed and routed. Alternatively, a timing analysis is performed to sort the list of paths according to the slack of the paths; and, the list of paths is processed sequentially. The order of the list of remaining paths may be changed after one or more paths are placed and routed, when detailed placement and routing information for the accurate assessment of the timing parameters is available.
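The path ordering described for FIG. 5 can be illustrated with a short sketch that always processes the remaining path with the lowest slack and re-evaluates slack after each path is handled. The function and variable names are hypothetical, and the slack update is faked with a dictionary; a real flow would rerun timing analysis using the routes just committed.

```python
def place_paths_by_criticality(paths, estimate_slack, place_and_route):
    """Process paths most-critical-first, re-sorting the remaining paths
    after each one because new routes can change their slack."""
    remaining = list(paths)
    processed = []
    while remaining:
        remaining.sort(key=estimate_slack)  # lowest slack = most critical
        path = remaining.pop(0)
        place_and_route(path)
        processed.append(path)
    return processed


# Hypothetical slack estimates for three paths (arbitrary time units).
path_slack = {"p1": -2.0, "p2": 0.5, "p3": 1.0}


def fake_place_and_route(path):
    # Assumption for the example: once p1 is routed, the detailed routes
    # show that p3 (which shares cells with p1) is worse than estimated.
    if path == "p1":
        path_slack["p3"] = -1.0


processed_order = place_paths_by_criticality(
    path_slack, path_slack.get, fake_place_and_route
)
print(processed_order)
```

In this example p3 overtakes p2 after p1 is routed, which shows why the order of the remaining list may change once detailed information becomes available.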

Although operations 403 and 505 suggest that the cells are split at nets driven by strong drivers, it is not necessary to split the cells at strong drivers. Any net on a path may be used to split a path. Selecting a set of nets driven by strong drivers limits the number of different choices for splitting the path, resulting in runtime savings. In one embodiment of the present invention, a set of alternative path splitting nets driven by strong drivers is selected. The path is placed once for each of the splitting nets, which splits the path into two segments for placing near the source and the destination locations; and, the path placement for one of the splitting nets with the best timing (or other design goal) is selected. However, in the event that the best path placement from the set of splitting nets still results in a long wire driven by a weak driver, all nets in the path are considered as splitting nets to select a best splitting net. If the best path placement for a net selected from all nets on the path as splitting nets still results in a long wire driven by a weak driver, sizing/buffering (e.g., in place optimization) of such nets can be considered. Such sizing/buffering may result in an illegal placement for the current iteration, which will be fixed in following iterations.
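The two-stage search over splitting nets can be sketched as follows. This is a hedged illustration: `evaluate` is a hypothetical callback standing in for placing and routing the path with a given splitting net, and it is assumed to return the resulting slack together with a flag saying whether a long wire driven by a weak driver remains. The sketch also assumes that at least one strong-driver net exists.

```python
def choose_splitting_net(all_nets, strong_nets, evaluate):
    """Try strong-driver nets first; widen the search to every net on the
    path only if the best result still has a long wire on a weak driver."""

    def best_of(candidates):
        # evaluate(net) -> (slack, has_long_wire_on_weak_driver)
        return max(candidates, key=lambda net: evaluate(net)[0])

    best = best_of(strong_nets)  # small candidate set, shorter runtime
    if evaluate(best)[1]:
        best = best_of(all_nets)  # fall back to all nets on the path
    needs_sizing_or_buffering = evaluate(best)[1]
    return best, needs_sizing_or_buffering


# Hypothetical evaluation results for a path with three nets.
results = {
    "n1": (0.2, True),
    "n2": (-0.1, True),
    "n3": (0.4, False),
}
chosen, needs_fix = choose_splitting_net(
    all_nets=["n1", "n2", "n3"],
    strong_nets=["n1", "n2"],
    evaluate=results.__getitem__,
)
print(chosen, needs_fix)
```

When the returned flag is still true after the wider search, sizing or buffering of the offending net would be the next step, as the paragraph above describes.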

In one embodiment of the present invention, illegal placements may be tolerated during iterations; and, more than one iteration may be used to reach a final solution without illegal placements. Placers in general operate best when sizes of the objects to be placed are fairly uniform; and, difficulties arise when placing a mix of large and small objects. Traditional placers for FPGA designs typically provide suboptimal results with objects of mixed sizes; and, traditional placers for ASIC designs use methodologies where objects larger than standard cells, such as memories, are excluded from the standard cell area, which increases the wire lengths to and from those larger cells.

To handle a mix of small, medium and large cells, a method according to one embodiment of the present invention splits the placement iteration into two or more phases. The first phase allows illegal overlapping placements of objects (e.g., large objects overlapping with previously placed small objects). The amount of overlap of a large object with previously placed small objects is taken as part of the cost function for selecting the best placement. In a second phase of the iteration, the placement of the larger objects from the first phase is locked down; and, only the smaller objects that overlap with the larger objects in the first phase are relocated so that they are not placed on top of the larger objects. When there is a very large range of object sizes, more than two placement phases per iteration could be used to eventually eliminate all illegal placements.
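A one-dimensional toy version of the two phases is sketched below. Everything in it is an assumption chosen for brevity: cells sit on a single row of unit slots, small cells are one slot wide, the wire cost is supplied by the caller, and displaced small cells simply move to the nearest free slot (the sketch assumes a free slot exists).

```python
def overlap(a_start, a_len, b_start, b_len):
    """Length of the overlap between two intervals on the row."""
    return max(0, min(a_start + a_len, b_start + b_len) - max(a_start, b_start))


def place_large_then_legalize(row_len, small, large_len, wire_cost, overlap_weight):
    """small maps a cell name to its slot; returns (large_start, legalized small)."""

    # Phase 1: pick the large cell's position; overlap is allowed but
    # contributes to the cost function.
    def cost(start):
        overlapped = sum(overlap(start, large_len, s, 1) for s in small.values())
        return wire_cost(start) + overlap_weight * overlapped

    large_start = min(range(row_len - large_len + 1), key=cost)

    # Phase 2: lock the large cell and relocate only the small cells under it.
    blocked = set(range(large_start, large_start + large_len))
    occupied = blocked | {s for s in small.values() if s not in blocked}
    legal = dict(small)
    for name, slot in small.items():
        if slot in blocked:
            free = [x for x in range(row_len) if x not in occupied]
            legal[name] = min(free, key=lambda x: abs(x - slot))
            occupied.add(legal[name])
    return large_start, legal


# Hypothetical row of 8 slots with three small cells already placed.
start, legal = place_large_then_legalize(
    row_len=8,
    small={"s1": 2, "s2": 3, "s3": 6},
    large_len=3,
    wire_cost=lambda s: abs(s - 2),  # assumed: wiring prefers a start at slot 2
    overlap_weight=0.5,
)
print(start, legal)
```

Here the large cell lands on top of s1 and s2 because the wiring benefit outweighs the overlap penalty; the second phase then moves only those two cells and leaves s3 untouched.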

FIG. 6 shows a method to place and route the cells on a path according to one embodiment of the present invention. Operation 601 places a cell (e.g., the beginning cell or the ending cell) on a path at a permissible location. Operation 603 routes the cell from the permissible location. Operation 605 determines a timing parameter (e.g., slack) and a congestion score for the cell. Operations 601-605 are repeated until operation 607 determines that all permissible locations for the cell are evaluated. Then, operation 609 selects one from the permissible locations which has the lowest congestion score and which has a timing parameter (e.g., slack) that is above a threshold. When none of the permissible locations has a timing parameter that is above the threshold, the location with the best timing parameter is selected. In one embodiment of the present invention, the distance to the destination (or source location if placing backward) is also considered in selecting a best location. Since different paths may share cells, a cell on a later placed path may be already placed in processing an earlier placed path. If operation 609 determines that the next cell is not already placed and routed, operation 621 starts to process the next cell; otherwise, operation 613 determines whether or not the next cell is a converging or diverging point of two paths. If the next cell is not the converging or diverging point of two paths, operation 617 skips the next cell to process the cell after the next cell on the path; otherwise, operation 615 determines whether or not the slack of the next cell is positive. If the next cell has positive slack, the next cell may be relocated to a permissible location that maintains positive slack for it; otherwise, the next cell is skipped by operation 617 to preserve the previously allocated location for it.
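The selection rule applied once all permissible locations have been evaluated can be written compactly. The sketch below is illustrative only: the tuple layout `(location, slack, congestion)` and the function name are assumptions, and the optional distance-to-destination term mentioned above is omitted.

```python
def select_location(candidates, slack_threshold):
    """candidates is a list of (location, slack, congestion) tuples.

    Prefer the least congested location whose slack exceeds the threshold;
    if no location meets the threshold, fall back to the best slack."""
    acceptable = [c for c in candidates if c[1] > slack_threshold]
    if acceptable:
        return min(acceptable, key=lambda c: c[2])[0]
    return max(candidates, key=lambda c: c[1])[0]


# Hypothetical evaluation results for three permissible locations.
evaluated = [
    ("A", 0.5, 0.9),
    ("B", 0.2, 0.3),
    ("C", -0.1, 0.1),
]
relaxed_choice = select_location(evaluated, slack_threshold=0.0)
strict_choice = select_location(evaluated, slack_threshold=1.0)
print(relaxed_choice, strict_choice)
```

With the relaxed threshold, location C is rejected despite having the lowest congestion because its slack is negative; with the strict threshold no location qualifies, so the location with the largest slack is returned.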

The foregoing description of the methods in FIGS. 3-6 assumes a particular process flow in which certain operations or actions follow other operations or actions. It will be appreciated that alternative flows may also be practiced with the present invention. Other alternative sequences of operations may be envisioned by those skilled in the art.

FIG. 7 shows an example to place and route the cells on a path in areas near the source and destination locations according to one embodiment of the present invention. Consider that the drive strength of cell 715 is the strongest on the path from source 701 to destination 703. Thus, the path is split at the net (logic wire) connecting cells 715 and 721. The segment of the path from cell 711 to cell 715 may be placed in cluster 710 near source 701; and, the segment of the path from cell 721 to cell 725 may be placed in cluster 720 near destination 703. To achieve such clustering, the segment near source 701 is placed starting with cell 711 and ending with cell 715; and, the segment near destination 703 is placed backward, starting with cell 725 and ending with cell 721. The segment in cluster 710 is placed toward destination 703; and, the segment in cluster 720 is placed toward source 701. However, it is not necessary to split the path at the net after the strongest driver. A number of different nets may be selected to split the path in a number of different ways; and, the best result is selected from the different designs (e.g., splitting and placing the path in a number of different ways).

FIG. 8 shows an example to place and route the cells on a path in clusters according to one embodiment of the present invention. A path may be broken into a number of segments at a set of nets connected to drivers of high drive strength. Consider that drivers 815 and 825 have the strongest driving strength. The segments of the cells can be placed in three clusters 810, 820, and 830 near the line between source 801 and destination 803. Similar to the segments in clusters 710 and 720 in FIG. 7, the segments in clusters 810 and 830 are placed close to the source and the destination of the path. The segment in cluster 810 is placed forward from cell 811 to cell 815; and, the segment in cluster 830 is placed backward from cell 835 to cell 831. The path from cell 821 to cell 825 is placed in cluster 820 near the line between source 801 and destination 803. Long wires 841 and 843 are used to connect the segments. The location of cluster 820 may be determined from a seed cell in the segment, which is already placed (e.g., placed and routed when processing a previous path). A number of permissible locations may be evaluated before a best location for placing the cluster is selected. A different number of segments of the path may be generated from a different number of splitting nets to generate different designs for placing the path; and, a best solution can be selected from optimizing a cost function.

FIG. 9 shows an example to place and route the cells that are shared by two paths according to one embodiment of the present invention. Consider that path 910 of cells 901, 903, 905, 907 and 909 is more critical in timing than path 920 of cells 911, 903, 905, 907, and 913. The two paths share cells 903, 905 and 907; cell 903 is the converging point of the two paths; and, cell 907 is the diverging point of the two paths. Path 920 is placed and routed after path 910 is placed and routed. After cell 911 is placed and routed, cell 903 is checked. Since cell 903 is already placed and is a converging point of the two paths, cell 903 may be relocated if it has positive slack. When cell 903 is relocated, it must be relocated to a location where it still has positive slack. If it has a negative slack or it cannot be relocated to another location to maintain positive slack, the previously assigned location for cell 903 is not changed. Cell 905 is also already placed (e.g., when path 910 is placed and routed). However, cell 905 is neither a converging point nor a diverging point. Thus, the location of cell 905 is not changed. Cell 907 is a diverging point and is treated similarly to cell 903; cell 907 may be relocated if it has positive slack before and after relocation. Thus, re-allocation of resources is performed for a cell that is a divergence/convergence point of critical paths, if, for the current allocation, the cell has positive slack and the new resource allocation maintains positive slack at the cell. This restriction is used to reduce the number of cells that are repeatedly evaluated as well as to maintain as much as possible of the placement of more critical paths.
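
The handling of shared cells in FIG. 9 may be sketched as follows. The function names are hypothetical. As an assumption of this sketch, a shared cell is classified as a converging point when its predecessors differ on the two paths and as a diverging point when its successors differ. Each path is assumed to list a cell at most once.

```python
from typing import Dict, Optional, Sequence, Set


def _neighbor(path: Sequence[int], index: int) -> Optional[int]:
    """Return the cell at index, or None when index is off either end."""
    return path[index] if 0 <= index < len(path) else None


def junction_cells(path_a: Sequence[int], path_b: Sequence[int]) -> Dict[int, Set[str]]:
    """Classify cells shared by two paths as converging and/or diverging points."""
    result: Dict[int, Set[str]] = {}
    for cell in set(path_a) & set(path_b):
        ia, ib = path_a.index(cell), path_b.index(cell)
        kinds: Set[str] = set()
        if _neighbor(path_a, ia - 1) != _neighbor(path_b, ib - 1):
            kinds.add("converging")
        if _neighbor(path_a, ia + 1) != _neighbor(path_b, ib + 1):
            kinds.add("diverging")
        if kinds:
            result[cell] = kinds
    return result


def may_relocate(is_junction: bool, current_slack: float, new_slack: float) -> bool:
    """An already placed cell moves only if it is a junction of the two paths
    and has positive slack both before and after the move."""
    return is_junction and current_slack > 0 and new_slack > 0
```

Applied to paths 910 and 920, this classification marks cell 903 as converging, cell 907 as diverging, and leaves cell 905 unmarked, so cell 905 keeps its location.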

At least one embodiment of the present invention seeks to assign cell/gate resources and routing resources for the cells/gates and nets in a given netlist in several iterations over the design. At the start of each of the iterations, timing analysis is done to compute the critical timing paths by comparing propagation delays to timing constraints (e.g., slack). The timing analysis computes delays by accounting for the existing assignment of cell and routing resources and using the pre-determined delays for those resources to calculate timing along paths (connected cells) in the netlist. For the initial timing analysis, a default allocation of resources is assumed where all nets have identical delay. The cells are grouped by path. A given cell can be in more than one path. The paths are ordered by slack, where paths with the lowest slack are considered first. The path ordering can be kept fixed during an iteration to save runtime, or dynamically updated as placement of the paths proceeds. A new allocation of cell and routing resources is then determined on a path basis.
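
The initial timing analysis and path ordering may be sketched as follows. The simplified delay model (sum of cell delays plus one identical default delay per net), the `DEFAULT_NET_DELAY` value, and the function names are assumptions made for illustration only.

```python
from typing import Dict, List, Sequence

# Assumed placeholder: every net has the same delay in the initial analysis.
DEFAULT_NET_DELAY = 0.5


def path_slack(cell_delays: Sequence[float], required_time: float,
               net_delay: float = DEFAULT_NET_DELAY) -> float:
    """Slack = timing constraint minus propagation delay along the path."""
    nets = max(len(cell_delays) - 1, 0)  # one net between consecutive cells
    return required_time - (sum(cell_delays) + nets * net_delay)


def order_paths(slack_by_path: Dict[str, float]) -> List[str]:
    """Most critical (lowest slack) paths first."""
    return sorted(slack_by_path, key=lambda name: slack_by_path[name])
```

Calling `order_paths` once per iteration corresponds to the fixed ordering; calling it again after each path is placed corresponds to the dynamically updated ordering.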

When assigning resources to a path, the path is examined starting with the cell that originates the path (e.g., the output of a sequential cell or an input pad/port cell) and proceeding to the next connected cell in the path. For each cell in a path, the valid locations are examined. A location is valid if its cell resources are not yet assigned and the cell is legal to place there. Legality of the cell can, for example, depend on the type of cell or routing resources at the location, or the available area at the location. For each valid location, timing analysis is done assuming that the given cell is assigned to cell resources at the location and that the fastest available routing is assigned to the nets connected to the given cell. Also, wire lengths for all nets connected to the cell are used to calculate an estimated congestion score for the cell. All other cells and nets are assumed to either keep the resources assigned to them in the current iteration or the resources assigned from the previous iteration. After all valid locations are considered, the given cell is assigned the location and routing resources that yield the highest slack for that cell. If the slack for possible locations is above a given positive threshold, the location with the lowest congestion score is chosen to reduce possible routing congestion. Since a cell can belong to more than one path, during processing of a path that has not yet been assigned resources, a cell that is already placed and routed for the current iteration can be encountered. In such a case, the existing resource assignments of the cell are kept and processing of the path continues to the next cell that is not yet assigned resources. Alternatively, an exception is made to cells at converging and diverging points of paths, which may be relocated when they have positive slack (or slack that is above a threshold value).

One embodiment of the present invention picks a set of alternative path splitting nets (cell output wires along the path). For each splitting net, the beginning of the path up to the splitting net is placed as in the preceding paragraph; and, the end of the path is placed backwards from the end of the path. One splitting net from the set of alternative splitting nets that yields the best placement and routing result is selected. In some technologies such as ASIC, the fastest placement for a path will cluster a set of cells near the source, followed by a long wire driven by the strongest driver in the path, followed by a cluster near the destination. This will be the fastest placement assuming that wire resistance is not significant. It is not always possible to achieve this partitioning of a path because of resource limitations near either the source or destination of the path. In one embodiment of the present invention, the set of splitting nets are chosen according to the drive strength of their driver; in another embodiment of the present invention, all nets on the path are selected as the set of alternative splitting nets.
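
The evaluation of alternative splitting nets may be sketched as follows. A splitting net is identified here by the index of its driving cell on the path, and the `evaluate` callback is a hypothetical stand-in for actually placing and routing both segments and scoring the result; neither is taken from the embodiment.

```python
from typing import Callable, Iterable, List, Sequence, Tuple


def split_orders(path: Sequence[str], split_after: int) -> Tuple[List[str], List[str]]:
    """Return the placement orders for the two segments of a split path.

    The first segment is placed forward from the start of the path up to the
    driver of the splitting net; the second is placed backward from the end.
    """
    forward = list(path[:split_after + 1])
    backward = list(reversed(path[split_after + 1:]))
    return forward, backward


def best_split(path: Sequence[str], candidate_splits: Iterable[int],
               evaluate: Callable[[List[str], List[str]], float]) -> int:
    """Try every candidate splitting net and keep the best-scoring one.

    A higher value returned by evaluate is assumed to mean a better result.
    """
    return max(candidate_splits, key=lambda k: evaluate(*split_orders(path, k)))
```

Passing only the indices of strong drivers as `candidate_splits` corresponds to the first embodiment above; passing every net index on the path corresponds to the second.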

While placing a cell in the path, the eventual destination of the path can be used to determine a wire length bonus for approaching the eventual destination. The wire length bonus can be used to break ties between otherwise equivalent placement options for a cell. In one embodiment of the present invention, the wire length bonus is also used in determining the cost function (or design goal) for optimization.
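
The tie-breaking use of the wire length bonus may be sketched as follows. Measuring the approach to the destination by Manhattan distance, and representing each option as a (location, cost) pair in which a lower cost is better, are assumptions of this sketch.

```python
from typing import Sequence, Tuple

Point = Tuple[int, int]


def manhattan(a: Point, b: Point) -> int:
    """Rectilinear distance, a common wire length estimate."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def break_ties(options: Sequence[Tuple[Point, float]], destination: Point) -> Point:
    """Choose the lowest-cost option; among equal costs, prefer the location
    that is closest to the eventual destination of the path."""
    best = min(options, key=lambda o: (o[1], manhattan(o[0], destination)))
    return best[0]
```

To use the bonus in the cost function itself rather than only for ties, the distance term could instead be weighted and added to the cost.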

If a path is long enough such that resistance is significant, the ideal placement may be of several clumps of cells along the line between the source and destination of the path. The splitting nets as defined previously ideally divide the clumps. A located seed for each clump can be found by finding a cell in each clump that has connections to previously placed cells for the placement iteration. In each case the size of the clump is checked against locally available resources.

As cells are placed along a path, either an actual route or a route estimate is performed for the connected nets based on the placement. This is not as feasible in traditional placement methods in which a much larger number of placements for a cell are considered. With a good global placement (such as generated by a quadratic placer) the number of placement iterations will usually be less than 10. This makes it affordable from a CPU time perspective to do routing during placement even for very large problems.

Global timing can be updated for each of the placement iterations, after the placement of a path, or after each cell in a path is assigned a location. Early iterations use less frequent updates to save CPU time.

At the end of each of the iterations, all cells and nets have been assigned resources; otherwise, the design is not feasible with the given resources. The current assignment of resources is evaluated. The worst (lowest) slack on any path is used to score the current solution. For all solutions, the one with the best score is kept. If two solutions have the same score, a second scoring function based on the 10 worst path slacks is used to determine which to keep.
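
The scoring of solutions across iterations may be sketched as follows. The description does not specify the form of the second scoring function; summing the 10 worst path slacks is an assumption of this sketch, as are the function names.

```python
from typing import List, Sequence


def primary_score(path_slacks: Sequence[float]) -> float:
    """Worst (lowest) slack on any path; higher is better."""
    return min(path_slacks)


def secondary_score(path_slacks: Sequence[float], worst_n: int = 10) -> float:
    """Assumed tie-breaker: sum of the worst_n lowest path slacks."""
    return sum(sorted(path_slacks)[:worst_n])


def keep_best(solutions: Sequence[Sequence[float]]) -> List[float]:
    """Keep the solution with the best primary score, using the secondary
    score only to decide between solutions with equal primary scores."""
    best = max(solutions, key=lambda s: (primary_score(s), secondary_score(s)))
    return list(best)
```

Each solution is represented here only by its list of path slacks; a fuller implementation would keep the resource assignment alongside it.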

In one embodiment of the present invention, resources are assigned for small, specific clusters (windows) of cells, instead of entire timing paths. The criteria to be optimized during placement and routing can also be used to select the clusters. For example, if the goal is to optimize timing delay, the selected group of cells will be a subsection of a critical timing path and the neighboring connected cells. If routing congestion is to be reduced, the cells can be selected from areas with high routing congestion. Also, the area for allocation can be reduced to a subspace of the entire available chip. For example, allocation of new cell and routing resources for a cluster can be restricted to a rectangular area that is the bounding box for the current allocation of the cells. The cells in the window can be grouped in paths starting at the inputs of the window (input nets that cross the window boundary) and ending at the outputs of the window (output nets that cross the window boundary). The list of paths is sorted so that the most timing critical path is placed first. The first cell to place will be the start of the most timing critical path; and, the next cell to place will be the next cell along the same path. Thus, instead of a path-based allocation across the entire available chip area, a cluster based allocation across a sub-area of the available chip can be performed to optimize local areas of the design with enhanced runtime and practical problem sizes.
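
The restriction of allocation to the bounding box of a cluster may be sketched as follows. The coordinate encoding, the inclusive box convention, and the function names are assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

Point = Tuple[int, int]
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive


def bounding_box(locations: Iterable[Point]) -> Box:
    """Smallest rectangle containing the current locations of the cluster."""
    points = list(locations)
    if not points:
        raise ValueError("cluster has no placed cells")
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def sites_in_window(free_sites: Iterable[Point], box: Box) -> List[Point]:
    """Keep only the candidate sites that fall inside the window."""
    return [s for s in free_sites
            if box[0] <= s[0] <= box[2] and box[1] <= s[1] <= box[3]]
```

Only the sites returned by `sites_in_window` would then be evaluated as permissible locations for the cells of the cluster.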

At least one embodiment of the present invention simultaneously allocates cell and routing resources (places and routes each gate separately); and, the algorithm incrementally places and routes an individual path or cluster of cells, while preserving resource allocation of the rest of the design. Implementation of one embodiment of the present invention for Xilinx Virtex/VirtexE FPGAs produced an average clock period improvement of 10% on a set of benchmark designs.

While most embodiments of the present invention are intended for use in an HDL design synthesis software program, the invention is not necessarily limited to such use. Although use of other languages and computer programs is possible (e.g. a computer program may be written to describe hardware and thus be considered an expression in an HDL and may be compiled or the invention, in some embodiments, may allocate and reallocate a logic representation, e.g. a netlist, which was created without the use of an HDL), embodiments of the present invention will be described in the context of use in HDL synthesis systems, and particularly those designed for use with integrated circuits which have vendor-specific technology/architectures. As is well known, the target architecture is typically determined by a supplier of programmable ICs. An example of a target architecture is the programmable lookup tables (LUTs) and associated logic of the integrated circuits which are field programmable gate arrays from Xilinx, Inc. of San Jose, Calif. Other examples of target architecture/technology include those well known architectures in field programmable gate arrays and complex programmable logic devices from vendors such as Altera, Lucent Technology, Advanced Micro Devices, and Lattice Semiconductor. For certain embodiments, the present invention may also be employed with application-specific integrated circuits (ASICs).

In the foregoing specification, the invention has been described with reference to specific exemplary embodiments thereof. It will be evident that various modifications may be made thereto without departing from the broader spirit and scope of the invention as set forth in the following claims. The specification and drawings are, accordingly, to be regarded in an illustrative sense rather than a restrictive sense.

We claim:
 1. A method to layout an integrated circuit, the method comprising: routing a wire to connect a first cell of the integrated circuit and a second cell of the integrated circuit; and placing a third cell of the integrated circuit after said routing the wire to connect the first cell and the second cell.
 2. The method of claim 1, wherein the first, the second and the third cells are on a first path.
 3. The method of claim 2, wherein the third cell is connected to one of the first and second cells on the first path by only one net.
 4. The method of claim 3, further comprising: selecting the first path from a set of paths; and placing the first and second cells before the routing the wire to connect the first cell and the second cell.
 5. The method of claim 4, further comprising: analyzing timing using a route of the wire connecting the first cell and the second cell to generate first timing information; selecting a second path from the set of paths from a timing analysis using the first timing information; and placing a cell on the second path.
 6. The method of claim 3, wherein the placing of the third cell comprises: determining whether or not the third cell is previously placed; and relocating the third cell in response to a determination that: the third cell is previously placed on a third path, the third cell is one of a converging point and a diverging point of the first path and the third path, and the third cell has positive slack.
 7. The method of claim 1, wherein the placing of the third cell comprises: determining wire delays for placing the third cell at a plurality of locations; and selecting a first location from the plurality of locations for the third cell according to timing based on the wire delays.
 8. The method of claim 7, wherein the first location results in one or more of: a lowest routing congestion and slack larger than a threshold, for the third cell among the plurality of locations; and a largest slack for the third cell among the plurality of locations.
 9. A machine readable medium containing executable computer program instructions which when executed by a digital processing system cause said system to perform a method to layout an integrated circuit, the method comprising: routing a wire to connect a first cell of the integrated circuit and a second cell of the integrated circuit; and placing a third cell of the integrated circuit after said routing the wire to connect the first cell and the second cell.
 10. The medium of claim 9, wherein the first, the second and the third cells are on a first path.
 11. The medium of claim 10, wherein the third cell is connected to one of the first and second cells on the first path by only one net.
 12. The medium of claim 11, wherein the method further comprises: selecting the first path from a set of paths; and placing the first and second cells before said routing the wire to connect the first cell and the second cell.
 13. The medium of claim 12, wherein the method further comprises: analyzing timing using a route of the wire connecting the first cell and the second cell to generate first timing information; selecting a second path from the set of paths from a timing analysis using the first timing information; and placing a cell on the second path.
 14. The medium of claim 11, wherein said placing the third cell comprises: determining whether or not the third cell is previously placed; and relocating the third cell in response to a determination that: the third cell is previously placed on a third path, the third cell is one of a converging point and a diverging point of the first path and the third path, and the third cell has positive slack.
 15. The medium of claim 9, wherein said placing the third cell comprises: determining wire delays for placing the third cell at a plurality of locations; and selecting a first location from the plurality of locations for the third cell according to timing based on the wire delays.
 16. The medium of claim 15, wherein the first location results in one or more of: a lowest routing congestion and a slack larger than a threshold for the third cell among the plurality of locations, and a largest slack for the third cell among the plurality of locations.
 17. A digital processing system to layout an integrated circuit, the digital processing system comprising: means for routing a wire to connect a first cell of the integrated circuit and a second cell of the integrated circuit; and means for placing a third cell of the integrated circuit after the wire is routed to connect the first cell and the second cell.
 18. The digital processing system of claim 17, wherein the first, second and third cells are on a first path, the system further comprising: means for selecting the first path from a set of paths; and means for placing the first and second cells before the wire is routed to connect the first cell and the second cell.
 19. The digital processing system of claim 18, further comprising: means for analyzing timing using a route of the wire connecting the first cell and the second cell to generate first timing information; means for selecting a second path from the set of paths from a timing analysis using the first timing information; and means for placing a cell on the second path.
 20. The digital processing system of claim 19, wherein the means for placing the third cell comprises one or more of: means for determining whether or not the third cell is previously placed, and means for relocating the third cell in response to a determination that: the third cell is previously placed on a third path, the third cell is one of a converging point and a diverging point of the first path and the third path, and the third cell has positive slack; and means for determining wire delays for placing the third cell at a plurality of locations and means for selecting a first location from the plurality of locations for the third cell according to timing based on the wire delays.
 21. The digital processing system of claim 20 wherein the first location results in one or more of: a lowest routing congestion and a slack larger than a threshold for the third cell among the plurality of locations; and a largest slack for the third cell among the plurality of locations.